Data Organization using PySpark in Jupyter Notebook

If you are using the Sample Project, the necessary Notebooks are already ingested in your Project.

Otherwise,

a) First, ensure that you have downloaded all Notebook files available in the Notebooks folder at the root of this repository.

b) Next, click Notebooks under Assets. On the resulting page, click Add Notebook.

c) Go to the From File tab and upload the notebook 'Prepare and Save Data For Building a ML Model'.
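Before uploading, it can help to confirm that the download in step a) is complete. The following is a minimal sketch, not part of the repository: the helper name find_notebooks is hypothetical, and the script assumes it is run from the repository root so that the Notebooks folder is reachable by that relative path. It lists every .ipynb file that parses as JSON and has a top-level "cells" entry, which is how Jupyter notebook files are stored.

```python
import json
from pathlib import Path


def find_notebooks(folder):
    """Return the names of valid .ipynb files in `folder`, sorted by name.

    `find_notebooks` is a hypothetical helper for illustration only.
    """
    valid = []
    for path in sorted(Path(folder).glob("*.ipynb")):
        try:
            with path.open(encoding="utf-8") as fh:
                content = json.load(fh)
        except (OSError, ValueError):
            # Unreadable file, or a truncated/corrupt download that is not valid JSON
            continue
        # A Jupyter notebook is a JSON object with a top-level "cells" entry
        if isinstance(content, dict) and "cells" in content:
            valid.append(path.name)
    return valid


if __name__ == "__main__":
    # Assumption: run from the repository root, where the Notebooks folder lives
    for name in find_notebooks("Notebooks"):
        print(name)
```

If the notebook you intend to upload does not appear in the output, download it again before continuing.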

Next, open the Notebook in a Spark Environment. To do so, click the three vertical dots to the right of the notebook name and use the highest available version of the Spark Environment. Then run all cells of the Notebook, following the instructions in the Notebook.